Abstract

In this paper we analyze the use of Chebyshev polynomials in distributed consensus applications. We
study the properties of these polynomials to propose a distributed algorithm that reaches consensus
quickly. The algorithm is expressed in the form of a linear iteration and, at each step, the
agents only need to transmit their current state to their neighbors. The difference with respect to
previous approaches is that the update rule used by the network is based on the second order difference
equation that describes the Chebyshev polynomials of the first kind. As a consequence, we show that our
algorithm achieves consensus using far fewer iterations than other approaches. We characterize the
main properties of the algorithm for both fixed and switching communication topologies. The main
contribution of the paper is the study of the properties of the Chebyshev polynomials in distributed
consensus applications, proposing an algorithm that increases the convergence rate with respect to
existing approaches. Theoretical results, as well as experiments with synthetic data, show the benefits
of using our algorithm.

Index Terms - Chebyshev polynomials, distributed consensus, convergence rate. 

I. Introduction 

Chebyshev polynomials [1] are a powerful mathematical tool that has proven to be very helpful
in many different fields of science. To name a few, they are used in the modeling of complex
chemical reaction systems [2], the simulation of satellite orbits around the Earth [3], the numerical
solution of diffusion-reaction equations with severely stiff reaction terms [4] and the recognition
of patterns in images using Support Vector Machine classification [5]. In this paper we study
the use of these polynomials in the field of distributed consensus applications.

In sensor networks and multi-agent systems, the consensus problem consists of making the
whole group of agents reach a common estimate of a specific measurement. Within the
control community many different distributed solutions have been proposed in the past years [6]-[12].
It is well known that the number of messages required to achieve the consensus depends
on the network connectivity. Interesting analyses of convergence have been carried out in [13], [14],
where consensus methods have been shown to behave in a similar manner to heat differential
equations and electrical resistive networks, respectively. Other interesting approaches analyze the
convergence with stochastic link failures [15], switching random networks [16] and asynchronous
consensus [17]. When the size of the network is large, communications between different pairs of
agents become more difficult due to distance and power constraints. Under these circumstances
the number of iterations required to reach the consensus is also large. For that reason a lot of
research has been devoted to mitigating this problem, providing a variety of solutions that reduce
the time to achieve the consensus.

Some works present continuous-time solutions to achieve consensus in finite time using
nonlinear methods [18]-[20]. The use of numerical integrators affects the number of iterations in
these approaches because they depend on the number of steps taken by the method. The approach
in [21] proposes a link scheduling that reaches the consensus in a finite number of steps. However,
in wireless networks, communications between direct neighbors depend on the distance that separates
them and therefore there might be situations in which this method cannot be used because not all
the links are feasible. Other approaches speed up convergence by sending additional information
in the messages. Following this idea, a multi-hop protocol is presented in [22] and second order
neighbors are considered in [23]. Unfortunately, the amount of additional information in both
cases depends on the topology. This implies that there might be situations in which large messages
must be sent.

The design of the adjacency matrix has been the focus of several works. For instance, the
work in [24] provides the optimal weights for the matrix, as well as good approximations that
do not require any global knowledge about the network topology. Different algorithms to solve
the optimization problem of finding the best matrix are proposed in [25]. Another optimization
method is proposed in [26], in this case considering a shift-register method with a fixed gain.
These approaches indeed improve the convergence speed; nevertheless, they can still be combined
with additional techniques in order to accelerate the consensus even more.

The distributed evaluation of polynomials, as well as the use of previous information in the
algorithm, have turned out to be easy ways to speed up the consensus while keeping the good
properties found in standard methods. The minimal polynomial of the adjacency matrix is used
in [27] and [28]. Once this polynomial is known, the network can achieve the consensus in
a finite number of communication rounds. Unfortunately, when the topology of the network
is time-varying this algorithm does not work, and for large networks the computation of the
polynomial can be inefficient. The approach in [29] uses a polynomial of fixed degree with
coefficients computed assuming the network is known. A consensus predictor is considered
in [30]. Different second order recurrences with fixed gains are used in [31], [32]. Finally, the
distributed evaluation of Chebyshev polynomials for consensus has been proposed in [33], [34].
Although the convergence of some of these algorithms under switching topologies has been
demonstrated in practice, to the authors' knowledge there is still a gap in the theoretical analysis
of the behavior of polynomial evaluation in this case.

In this paper we try to fill this gap, extending the results presented in [33] about Chebyshev
polynomials and their use in consensus applications. In [33] we introduced the algorithm, based
on a second order difference equation, and we studied its convergence to consensus for stochastic
symmetric matrices in fixed graphs. In this paper we extend the convergence result, considering
non-symmetric matrices that can have complex eigenvalues. We also provide a complete study
of the parameters that make the algorithm achieve the optimal convergence rate and we give
bounds on the selection of these parameters to achieve a faster convergence than using the powers
of the weighted adjacency matrix. Regarding the case of switching communication topologies,
we are able to theoretically show that there always exist parameters that make the proposed
algorithm converge to the consensus. Experiments with synthetic data show the benefits of using
our algorithm compared to other methods.

The structure of the paper is the following: In section II we introduce some background
about the Chebyshev polynomials and distributed consensus. In section III we present the new
distributed consensus algorithm using Chebyshev polynomials. In sections IV and V we study
the properties of the algorithm with fixed and switching communication topologies, respectively.
In section VI we analyze the behavior of the algorithm in a simulated setup. Finally, in section
VII the conclusions of the work are presented. In order to simplify the reading of the manuscript
we have moved to an appendix some of the proofs of the theoretical results in sections III and
IV. We have left in the text only the proofs that contain convenient information to follow the
analysis.

II. Background on Chebyshev Polynomials and Distributed Consensus

In this paper we consider Chebyshev polynomials of the first kind [1]. We denote the Chebyshev
polynomial of degree $n$ by $T_n(x)$. These polynomials satisfy

$$T_n(x) = \cos(n \arccos x), \quad \text{for all } x \in [-1, 1], \tag{1}$$

and $|T_n(x)| > 1$ when $|x| > 1$, for all $n \in \mathbb{N}$. A more general way to define these polynomials
in the real domain is using a second order recurrence,

$$T_0(x) = 1, \quad T_1(x) = x, \qquad T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x), \quad n \geq 2. \tag{2}$$

By the theory of difference equations [35], the direct expression of (2) is determined by the
roots $\tau_1$ and $\tau_2$ of the characteristic equation,

$$T_n(x) = \frac{1}{2}\left(\tau_1(x)^n + \tau_2(x)^n\right), \tag{3}$$

where $\tau_1(x) = x - \sqrt{x^2 - 1}$ and $\tau_2(x) = x + \sqrt{x^2 - 1} = 1/\tau_1(x)$. In the paper we take

$$\tau(x) = \begin{cases} x - \sqrt{x^2 - 1}, & \text{if } x \geq 0, \\ x + \sqrt{x^2 - 1}, & \text{if } x < 0, \end{cases} \tag{4}$$

so that $|\tau(x)| < 1$ and $|\tau(x)|^{-1} > 1$ for all $|x| > 1$, and therefore,

$$T_n(x) = \frac{1}{2}\left(\tau(x)^n + \tau(x)^{-n}\right) = \frac{1}{2}\tau(x)^{-n}\left(1 + \tau(x)^{2n}\right). \tag{5}$$

It is clear that if $|x| > 1$, then $|T_n(x)|$ goes to infinity as $n$ grows. If $|x| < 1$, then $\tau(x)$ is a
complex number with $|\tau(x)| = 1$ and $|T_n(x)| \leq 1$, $\forall n$, as stated in eq. (1).
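
As a quick numerical illustration (a minimal Python sketch that is not part of the original analysis; the function names are ours), the recurrence (2), the trigonometric form (1) and the direct expression (5) can be compared with each other:

```python
import math


def cheb_recurrence(n, x):
    """Evaluate T_n(x) with the second order recurrence (2)."""
    t_prev, t_curr = 1.0, x  # T_0(x), T_1(x)
    if n == 0:
        return t_prev
    for _ in range(2, n + 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def tau(x):
    """Root of modulus below one chosen in (4); real-valued only for |x| >= 1."""
    root = math.sqrt(x * x - 1.0)
    return x - root if x >= 0 else x + root


def cheb_direct(n, x):
    """Evaluate T_n(x) with the direct expression (5), here for |x| > 1 only."""
    t = tau(x)
    return 0.5 * (t ** n + t ** (-n))


if __name__ == "__main__":
    # Inside [-1, 1] the recurrence matches cos(n arccos x) and stays bounded by 1.
    print(cheb_recurrence(5, 0.3), math.cos(5 * math.acos(0.3)))
    # Outside [-1, 1] the values grow like tau(x)^(-n).
    print(cheb_recurrence(6, 1.5), cheb_direct(6, 1.5))
```

The sketch restricts the direct expression to real arguments with |x| > 1, where the square root is real; the complex case would need `cmath` instead.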

For the analysis in the paper, it is also convenient to describe the behavior of Chebyshev
polynomials evaluated at complex numbers. For any $z \in \mathbb{C}$, Chebyshev polynomials, $T_n(z)$, on
the complex plane can also be expressed by (5), where $\tau(z)$ is now defined by

$$\tau(z) = \begin{cases} z - \sqrt{z^2 - 1}, & \text{if } |z - \sqrt{z^2 - 1}| < 1, \\ z + \sqrt{z^2 - 1}, & \text{otherwise}, \end{cases} \tag{6}$$

and again $|\tau(z)| \leq 1$ and $|\tau(z)|^{-1} \geq 1$ for all $z$. However, note that Chebyshev polynomials
evaluated at a complex number, $T_n(z)$, always go to infinity as $n$ grows.

Consider now a set of agents, $\mathcal{V} = \{1, \ldots, N\}$, with limited communication capabilities.
A distributed algorithm achieves consensus if, starting with initial conditions $x_i(0) \in \mathbb{R}$, and
using only local interactions between agents, $x_i(n) = x_j(n)$, $\forall i, j \in \mathcal{V}$, as $n \to \infty$. The
interactions between the agents are modeled using an undirected graph $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$, where
$\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ describes the communications between pairs of agents. In this way, agents $i$ and $j$
can communicate if and only if $(i, j) \in \mathcal{E}$. The neighbors of one agent $i \in \mathcal{V}$ are the subset of
agents that can directly communicate with it; i.e., $\mathcal{N}_i = \{j \in \mathcal{V} \mid (i, j) \in \mathcal{E}\}$. Initially, let us
assume that the communication graph is fixed and connected.

The discrete time distributed consensus algorithm based on the weighted adjacency matrix
associated to the communication graph [6] is

$$x_i(n) = a_{ii}\,x_i(n-1) + \sum_{j \in \mathcal{N}_i} a_{ij}\,x_j(n-1), \tag{7}$$

with $x_i(0) = x_i$. The algorithm can also be expressed in vectorial form as

$$\mathbf{x}(n) = \mathbf{A}\mathbf{x}(n-1), \tag{8}$$

where $\mathbf{x}(n) = (x_1(n), \ldots, x_N(n))^T$ and $\mathbf{A} = [a_{ij}] \in \mathbb{R}^{N \times N}$ is the weight matrix.

Assumption 2.1 (Stochastic Weights): $\mathbf{A}$ is row stochastic and compatible with the underlying
graph, $\mathcal{G}$, i.e., it is such that $a_{ii} \neq 0$, $a_{ij} \neq 0$ only if $(i, j) \in \mathcal{E}$, and $\mathbf{A}\mathbf{1} = \mathbf{1}$.



Since the communication graph is connected, by Assumption 2.1, $\mathbf{A}$ has one eigenvalue $\lambda_1 = 1$
with associated right eigenvector $\mathbf{1}$ and algebraic multiplicity equal to one. The rest of the
eigenvalues, real or complex, satisfy $|\lambda_i| < 1$, $i = 2, \ldots, N$. Without loss of generality, let us
suppose that all the eigenvalues are simple. We denote by $\lambda_2$ the second largest and $\lambda_N$ the
smallest real eigenvalues and we assume that $\max\{|\lambda_2|, |\lambda_N|\} > |\lambda_i|$, $i = 3, \ldots, N-1$.
Any initial conditions $\mathbf{x}(0)$ can be expressed as a sum of eigenvectors of $\mathbf{A}$,

$$\mathbf{x}(0) = \mathbf{v}_1 + \ldots + \mathbf{v}_N,$$

where $\mathbf{v}_i$ is a right eigenvector associated to the eigenvalue $\lambda_i$. Specifically, $\mathbf{v}_1$ will be of the
form $(\mathbf{w}_1^T\mathbf{x}(0)/\mathbf{w}_1^T\mathbf{1})\mathbf{1}$, with $\mathbf{w}_1$ a left eigenvector of $\mathbf{A}$ associated to $\lambda_1$. It is clear that

$$\mathbf{x}(n) = \mathbf{A}^n\mathbf{x}(0) = \mathbf{v}_1 + \lambda_2^n\mathbf{v}_2 + \ldots + \lambda_N^n\mathbf{v}_N,$$



and since $|\lambda_i| < 1$, $i \neq 1$, the consensus is asymptotically reached by all the agents in the
network, i.e., $\lim_{n \to \infty} \mathbf{x}(n) = \mathbf{v}_1 = (\mathbf{w}_1^T\mathbf{x}(0)/\mathbf{w}_1^T\mathbf{1})\mathbf{1}$. The asymptotic convergence implies that
the exact consensus value will not be achieved in a finite number of iterations. In practice, the
consensus is said to be achieved when $|x_i(n) - x_j(n)| < tol$ for all $i$ and $j$, and a prefixed
error tolerance $tol$. The convergence speed of (8) depends on $\max(|\lambda_2|, |\lambda_N|)$. When the size
of the network is large or the number of links is small this value is usually close to one, which
means that the algorithm requires many iterations before obtaining a good approximation of the
final solution.
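
To make the stopping rule concrete, the following minimal Python sketch runs iteration (8) until the tolerance criterion is met. The ring topology and the equal weights of 1/3 are assumptions chosen only for illustration (they give a symmetric, row-stochastic matrix, so the consensus value is the average); the function names are ours.

```python
def ring_weights(num_agents):
    """Symmetric row-stochastic weights on a ring (num_agents >= 3):
    each agent gives weight 1/3 to itself and to each of its two neighbors."""
    A = [[0.0] * num_agents for _ in range(num_agents)]
    for i in range(num_agents):
        for j in (i - 1, i, i + 1):
            A[i][j % num_agents] = 1.0 / 3.0
    return A


def mat_vec(A, x):
    """Row i is the local update (7) computed by agent i."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def power_consensus(A, x0, tol=1e-6, max_it=100000):
    """Iterate x(n) = A x(n-1) until max_i x_i - min_i x_i < tol.
    Returns the final state and the number of iterations used."""
    x = list(x0)
    for n in range(1, max_it + 1):
        x = mat_vec(A, x)
        if max(x) - min(x) < tol:
            return x, n
    return x, max_it


if __name__ == "__main__":
    A = ring_weights(10)
    x, n = power_consensus(A, [float(k) for k in range(10)])
    print(n, x[0])  # iterations needed and (approximate) consensus value
```

For this ring the second largest eigenvalue is already close to 0.87 with only ten agents, which illustrates why sparse or large networks need many iterations.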

When the communication topology changes with time, $\mathcal{G}(n) = \{\mathcal{V}, \mathcal{E}(n)\}$, eq. (8) becomes
$\mathbf{x}(n) = \mathbf{A}(n)\mathbf{x}(n-1)$, where the different weight matrices are defined according to their respective
underlying communication graphs. If the different weight matrices satisfy Assumption 2.1, and
the sequence of matrices is non-degenerate, the algorithm is still proved to achieve consensus.
We refer the reader to [6] for further information about this case.

III. Consensus algorithm using Chebyshev polynomials

The distributed evaluation of polynomials provides an easy way to speed up the consensus,
keeping the good properties found in standard methods. The main idea consists in designing a
distributed linear iteration such that the execution of a fixed number of $n$ steps is equivalent
to the evaluation of some polynomial, $P_n(x)$, in the fixed matrix $\mathbf{A}$ [27], [29]. The polynomial
must satisfy $P_n(1) = 1$ and $|P_n(x)| < 1$ if $|x| < 1$. In this way, successive evaluations of
the polynomial in $\mathbf{A}$ lead to the consensus. The choice of the polynomial determines the
convergence speed of the algorithm, given by $\max_{\lambda_i \neq 1} |P_n(\lambda_i)|$, with $\lambda_i$ the eigenvalues of $\mathbf{A}$.
Two reasons motivate the choice of Chebyshev polynomials for the consensus problem:
• By using the recurrent definition (2), instead of considering a polynomial of fixed degree we
can evaluate Chebyshev polynomials of higher and higher degree as successive iterations
of the algorithm are executed.
• Chebyshev polynomials have the mini-max property [1]. This property says that, among all
the monic polynomials of degree $n$, the polynomial $2^{1-n}T_n(x)$ is the one that minimizes
the uniform norm on the interval $[-1, 1]$. This property is indeed quite convenient for our
purposes. If the matrix $\mathbf{A}$ is unknown, using the Chebyshev polynomials we are minimizing
$\max_{\lambda \in [-1,1]} |P_n(\lambda)|$, and therefore we have a good chance of obtaining a good convergence rate.



However, the monic version of the Chebyshev polynomials does not satisfy $2^{1-n}T_n(1) = 1$.
In order to keep this property we perform a linear transformation of $T_n(x)$, using two real
coefficients $\lambda_m$, $\lambda_M$, with $1 > \lambda_M > \lambda_m > -1$, bringing the interval $[\lambda_m, \lambda_M]$ to $[-1, 1]$. In this
way, we define the polynomial

$$P_n(x) = \frac{T_n(cx - d)}{T_n(c - d)}, \quad \text{with } c = \frac{2}{\lambda_M - \lambda_m}, \quad d = \frac{\lambda_M + \lambda_m}{\lambda_M - \lambda_m}, \tag{9}$$

which, for all $n$, has the following properties:
• if $x \in [\lambda_m, \lambda_M]$, then $cx - d \in [-1, 1]$
• $P_n(1) = 1$ and $P_n(\lambda_M + \lambda_m - 1) = (-1)^n$
• $|P_n(x)| < 1$ for all $x \in (\lambda_M + \lambda_m - 1, 1)$ and $|P_n(x)| \geq 1$ otherwise.
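
The second property can be verified directly from (9). The short derivation below is a sketch that only uses the parity relation $T_n(-y) = (-1)^n T_n(y)$ of Chebyshev polynomials, which is a standard fact not stated explicitly in the text:

```latex
\begin{align*}
c(\lambda_M + \lambda_m - 1) - d
  &= \frac{2(\lambda_M + \lambda_m) - 2 - (\lambda_M + \lambda_m)}{\lambda_M - \lambda_m}
   = -\frac{2 - \lambda_M - \lambda_m}{\lambda_M - \lambda_m}
   = -(c - d), \\
P_n(\lambda_M + \lambda_m - 1)
  &= \frac{T_n\bigl(-(c - d)\bigr)}{T_n(c - d)}
   = (-1)^n .
\end{align*}
```

Since $c - d > 1$ whenever $\lambda_M < 1$, the points $x = 1$ and $x = \lambda_M + \lambda_m - 1$ are mapped to $\pm(c - d)$, symmetric about the origin and outside $[-1, 1]$.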

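The polynomial defined in (9) satisfies the recurrence

$$P_n(x) = 2\frac{T_{n-1}(c-d)}{T_n(c-d)}(cx - d)P_{n-1}(x) - \frac{T_{n-2}(c-d)}{T_n(c-d)}P_{n-2}(x), \tag{10}$$

and the consensus rule $\mathbf{x}(n) = P_n(\mathbf{A})\mathbf{x}(0)$ is defined by

$$\begin{aligned}
\mathbf{x}(1) &= P_1(\mathbf{A})\mathbf{x}(0) = \frac{1}{T_1(c-d)}(c\mathbf{A} - d\mathbf{I})\mathbf{x}(0), \\
\mathbf{x}(n) &= P_n(\mathbf{A})\mathbf{x}(0) = \left(2\frac{T_{n-1}(c-d)}{T_n(c-d)}(c\mathbf{A} - d\mathbf{I})P_{n-1}(\mathbf{A}) - \frac{T_{n-2}(c-d)}{T_n(c-d)}P_{n-2}(\mathbf{A})\right)\mathbf{x}(0) \\
&= 2\frac{T_{n-1}(c-d)}{T_n(c-d)}(c\mathbf{A} - d\mathbf{I})\mathbf{x}(n-1) - \frac{T_{n-2}(c-d)}{T_n(c-d)}\mathbf{x}(n-2), \quad n \geq 2,
\end{aligned} \tag{11}$$

with $\mathbf{I}$ the identity matrix of dimension $N$. Notice that this consensus rule is well suited to
be executed in a distributed fashion.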

When the topology of the network changes, the recurrent evaluation of Chebyshev polynomials
(11) can still be used. The time-varying version of the algorithm is equivalent to (11) replacing
the constant weight matrix $\mathbf{A}$ by the weight matrix at each step, $\mathbf{A}(n)$. Although this is no longer
equivalent to the distributed evaluation of a Chebyshev polynomial, a theoretical analysis of
its convergence properties is still possible. Algorithm 1 shows a possible implementation of the
algorithm. In the rest of the paper we analyze, both in theory and practice, the main properties
of this algorithm for fixed and switching communication topologies.



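Algorithm 1 Consensus algorithm using Chebyshev polynomials - agent $i$

Require: $x_i(0)$, $MaxIt \in \mathbb{N}$, $\lambda_m$, $\lambda_M$
- Initialization
  $c = 2/(\lambda_M - \lambda_m)$; $d = (\lambda_M + \lambda_m)/(\lambda_M - \lambda_m)$;
  $T(0) = 1$; $T(1) = c - d$;
- First Communication Round
  $x_i(1) = \frac{1}{T(1)}\left(c\sum_{j \in \mathcal{N}_i} a_{ij}x_j(0) + (c\,a_{ii} - d)\,x_i(0)\right)$;
for $n = 2, \ldots, MaxIt$ do
  $T(n) = 2(c - d)\,T(n-1) - T(n-2)$;
  - Communication Between Neighbors
  $x_i(n) = 2\frac{T(n-1)}{T(n)}\left(c\sum_{j \in \mathcal{N}_i} a_{ij}x_j(n-1) + (c\,a_{ii} - d)\,x_i(n-1)\right) - \frac{T(n-2)}{T(n)}\,x_i(n-2)$;
end for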
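
A centralized simulation of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions: all agents are simulated in one process, the weight matrix is passed as a list of rows, and the function and argument names are ours. Each row of the helper `shifted` corresponds to the quantity one agent computes from its neighbors' states.

```python
def chebyshev_consensus(A, x0, lam_m, lam_M, max_it):
    """Simulate recurrence (11) for max_it >= 1 iterations and return x(max_it)."""
    c = 2.0 / (lam_M - lam_m)
    d = (lam_M + lam_m) / (lam_M - lam_m)
    num_agents = len(x0)

    def shifted(x):
        # (c*A - d*I) x; entry i only needs x_i and the states of i's neighbors.
        return [c * sum(a * v for a, v in zip(A[i], x)) - d * x[i]
                for i in range(num_agents)]

    t_prev2, t_prev = 1.0, c - d            # T_0(c-d), T_1(c-d)
    x_prev2 = list(x0)                      # x(0)
    x_prev = [v / t_prev for v in shifted(x_prev2)]  # x(1), first round
    for _ in range(2, max_it + 1):
        t_curr = 2.0 * (c - d) * t_prev - t_prev2    # scalar recurrence (2)
        y = shifted(x_prev)
        x_curr = [2.0 * t_prev / t_curr * yi - t_prev2 / t_curr * xi
                  for yi, xi in zip(y, x_prev2)]
        x_prev2, x_prev = x_prev, x_curr
        t_prev2, t_prev = t_prev, t_curr
    return x_prev


if __name__ == "__main__":
    # Hypothetical test network: ring of 10 agents with equal weights 1/3.
    N = 10
    A = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in (i - 1, i, i + 1):
            A[i][j % N] = 1.0 / 3.0
    x = chebyshev_consensus(A, [float(k) for k in range(N)], -0.34, 0.88, 30)
    print(max(x) - min(x))
```

Note that $T(n)$ grows geometrically with $n$, so for very long runs a practical implementation might prefer to propagate the ratios $T(n-1)/T(n)$ instead of the raw values; that variation is not part of Algorithm 1 as stated.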



IV. Analysis with a Fixed Communication Topology

In this section we analyze the main properties of the proposed algorithm when the network
topology is fixed. In particular we first study the convergence conditions of the algorithm. Next,
we find the parameters that maximize the convergence speed. Finally, we give bounds on the
selection of these parameters to guarantee that our algorithm achieves the consensus faster than (8).

Theorem 4.1 (Convergence of the algorithm): Let $\mathbf{A}$ be diagonalizable, fulfilling Assumption
2.1, and let the parameters $\lambda_m$ and $\lambda_M$ be such that $1 > \lambda_M > \lambda_m > -1$. If the minimum real eigenvalue
of $\mathbf{A}$ satisfies $\lambda_N > \lambda_M + \lambda_m - 1$ and the complex eigenvalues, $\lambda_z$, of $\mathbf{A}$ satisfy $|\tau(c\lambda_z - d)| > \tau(c - d)$,
then the recurrence in eq. (11) converges to the consensus state, $\lim_{n \to \infty} \mathbf{x}(n) = \mathbf{w}_1^T\mathbf{x}(0)\mathbf{1}/\mathbf{w}_1^T\mathbf{1}$.
Besides, the convergence rate is given by

$$\max_{\lambda_i \neq 1} \frac{|T_n(c\lambda_i - d)|}{T_n(c - d)}. \tag{12}$$

Proof. See the Appendix.



Note that the conditions in Theorem 4.1 are easy to fulfill without the necessity of knowing the
eigenvalues of the matrix $\mathbf{A}$. For the real eigenvalues, any symmetric selection of the parameters,
i.e., $-\lambda_m = \lambda_M$, $0 < \lambda_M < 1$, satisfies the condition in Theorem 4.1. The condition on the
complex eigenvalues has some geometrical meaning [1]. Imposing that $|\tau(c\lambda_z - d)| > \tau(c - d)$
is equivalent to requiring that $\lambda_z$ is inside an ellipse in the complex plane centered at $(d/c, 0)$, or
equivalently $((\lambda_M + \lambda_m)/2, 0)$, and with semi-axes $e_1 = (c - d)/c$ and $e_2 = \sqrt{(c - d)^2 - 1}/c$
(see Fig. 1). In practice, any parameters that ensure convergence for the real eigenvalues also
ensure convergence for the complex ones. We have observed that if $\mathbf{A}$ is defined using well
known distributed methods [24], the complex eigenvalues, when there are any, always
have a very small modulus. For that reason, in the rest of the section we will assume that the
matrix $\mathbf{A}$ has only real eigenvalues.

Fig. 1. Ellipse where all the eigenvalues must be contained in order to achieve the consensus. In this particular example we
have chosen $\lambda_M = 0.9$ and $\lambda_m = -0.5$. Note that when the imaginary part of the eigenvalues is zero, convergence is achieved
if $\lambda_M + \lambda_m - 1 < \lambda < 1$, as stated in Theorem 4.1.


Next, we are interested in knowing the optimal selection of $\lambda_m$ and $\lambda_M$ to maximize the
convergence speed. From Theorem 4.1 we know that the convergence rate is given by the factor in (12).
If the conditions in Theorem 4.1 are satisfied, for any $\lambda$, a simple calculation using eq. (5)
leads to

$$\frac{|T_n(c\lambda - d)|}{T_n(c - d)} = \left(\frac{\tau(c - d)}{|\tau(c\lambda - d)|}\right)^n \frac{|1 + \tau(c\lambda - d)^{2n}|}{1 + \tau(c - d)^{2n}}. \tag{14}$$

It is clear that when $n \to \infty$, the second fraction on the right side of (14) goes to 1. Therefore,
the convergence rate is determined by

$$\max\left\{\frac{\tau(c - d)}{|\tau(c\lambda_N - d)|}, \frac{\tau(c - d)}{|\tau(c\lambda_2 - d)|}\right\}. \tag{15}$$



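If $[\lambda_N, \lambda_2] \subseteq [\lambda_m, \lambda_M]$, then $\max_{\lambda_i \neq 1} |T_n(c\lambda_i - d)| \leq 1$ and therefore we can define the convergence
factor as

$$\nu(c, d) = \begin{cases} \tau(c - d), & \text{if } [\lambda_N, \lambda_2] \subseteq [\lambda_m, \lambda_M], \\ \max\left\{\dfrac{\tau(c - d)}{|\tau(c\lambda_N - d)|}, \dfrac{\tau(c - d)}{|\tau(c\lambda_2 - d)|}\right\}, & \text{otherwise}. \end{cases} \tag{16}$$

The optimum values of $\lambda_M$ and $\lambda_m$ will be those that lead to the minimum value of $\nu(c, d)$.
In [33] it was proved that among the values of the parameters satisfying $[\lambda_N, \lambda_2] \subseteq [\lambda_m, \lambda_M]$,
the ones that yield the minimum convergence factor are precisely $\lambda_m = \lambda_N$ and $\lambda_M = \lambda_2$. Let
us see that they are also the optimum parameters in the case $[\lambda_N, \lambda_2] \not\subseteq [\lambda_m, \lambda_M]$.

Theorem 4.2 (Optimal parameters): The convergence rate $\nu(c, d)$ attains its minimum value
for the parameters $c$, $d$ such that $\lambda_M = \lambda_2$ and $\lambda_m = \lambda_N$.

Proof. See the Appendix. ■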

This implies that in order to achieve the maximum convergence speed, some knowledge about
the network is required. However, even if the network topology is unknown, it is important to
study when the algorithm converges faster than (8). Since the symmetric assignation
of the parameters, $\lambda_m = -\lambda_M$, always ensures convergence, in the last result of this section we
provide bounds for this particular case that also guarantee faster convergence than (8).



Theorem 4.3 (Faster convergence than $\mathbf{A}^n$): For any matrix $\mathbf{A}$ satisfying Assumption 2.1 let
$\lambda = \max(|\lambda_2|, |\lambda_N|)$ be the convergence rate in (8). For any

$$0 < \lambda_M < \frac{2\lambda}{\lambda^2 + 1}, \quad \text{and} \quad \lambda_m = -\lambda_M, \tag{17}$$

$P_n(\lambda)$ goes to zero faster than $\lambda^n$ when $n$ goes to infinity. Therefore the algorithm in eq. (11)
converges to the consensus faster than the one in eq. (8).

Proof. See [33]. ■

Remark 4.4: The above result shows that there always exist parameters that make the proposed
algorithm faster than (8). Therefore, if the algorithm is executed using the optimal parameters,
it will also converge to the average faster than (8).
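
The bound (17) can be explored numerically. The sketch below is an illustration under our own assumptions: it takes real eigenvalues, uses the symmetric assignment (so $c = 1/\lambda_M$ and $d = 0$), and evaluates the asymptotic factor (16) at $\lambda = \max(|\lambda_2|, |\lambda_N|)$; the function names are hypothetical.

```python
import math


def tau(x):
    """tau(x) from (4), for real x with |x| >= 1."""
    root = math.sqrt(x * x - 1.0)
    return x - root if x >= 0 else x + root


def symmetric_factor(lam, lam_M):
    """Asymptotic convergence factor (16) for lam_m = -lam_M (c = 1/lam_M, d = 0),
    where lam = max(|lambda_2|, |lambda_N|) is the rate of iteration (8)."""
    c = 1.0 / lam_M
    if lam <= lam_M:
        # All eigenvalues other than 1 are mapped inside [-1, 1].
        return tau(c)
    # The extreme eigenvalue is mapped outside [-1, 1].
    return tau(c) / abs(tau(c * lam))


def upper_bound(lam):
    """Right-hand limit for lam_M in (17)."""
    return 2.0 * lam / (lam * lam + 1.0)


if __name__ == "__main__":
    lam = 0.9
    for lam_M in (0.1, 0.5, 0.9, 0.99, 0.999):
        inside = lam_M < upper_bound(lam)
        print(lam_M, inside, symmetric_factor(lam, lam_M))
```

With $\lambda = 0.9$ the bound is about 0.9945; values of $\lambda_M$ below it give a factor smaller than 0.9, and the factor is smallest when $\lambda_M$ coincides with $\lambda$.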

Finally, a graphical comparison of $x^n$, $T_n(x)$ and $P_n(x)$ is depicted in Fig. 2 for $n = 4$, in
the interval $[-1, 1]$. Note that $T_n(x)$ cannot be used in the consensus process because at some
points it would not reduce the error. On the other hand, as we have shown throughout the section,
$P_n(x)$ satisfies the conditions required to achieve consensus. Also notice that $P_n(x)$ takes values
closer to zero than $x^n$ at points close to $-1$ and $1$, which supports the claim that the error
associated to eigenvalues in those regions will be reduced faster.

Fig. 2. Plot of the polynomials $x^n$, $T_n(x)$ and $P_n(x)$. In the figure $n = 4$, $\lambda_m = -0.95$ and $\lambda_M = 0.95$.

V. Analysis with a Switching Communication Topology

We are interested now in the study of the recursive evaluation of (11) when the topology of
the network, and therefore the matrix $\mathbf{A}$, changes at different iterations. Given initial conditions
$\mathbf{x}(0)$, the distributed recurrence now reads:

$$\begin{aligned}
\mathbf{x}(1) &= \frac{1}{T_1(c-d)}(c\mathbf{A}(1) - d\mathbf{I})\mathbf{x}(0), \\
\mathbf{x}(n) &= 2\frac{T_{n-1}(c-d)}{T_n(c-d)}(c\mathbf{A}(n) - d\mathbf{I})\mathbf{x}(n-1) - \frac{T_{n-2}(c-d)}{T_n(c-d)}\mathbf{x}(n-2), \quad n \geq 2.
\end{aligned} \tag{18}$$

Note that this recurrence is suitable for switching weight matrices. However, the evaluation of
the recurrence is no longer equivalent to $P_n(\mathbf{A})\mathbf{x}(0)$, for some matrix $\mathbf{A}$. This means that we
are not exactly evaluating the transformed Chebyshev polynomials in the eigenvalues of some
matrix anymore. Nevertheless, a theoretical analysis is still possible.

For this analysis, the matrices $\mathbf{A}(n)$ now require the following assumption.

Assumption 5.1 (Non-Degenerate Stochastic Weights): The matrices $\mathbf{A}(n)$ are row stochastic,
symmetric, non-degenerate and compatible with the underlying graphs, $\mathcal{G}(n)$, for all $n$, i.e., they
are such that $\mathbf{A}(n)\mathbf{1} = \mathbf{1}$, $a_{ii}(n) \geq \epsilon$ and $a_{ij}(n) \in \{0\} \cup [\epsilon, 1)$ with $0 < \epsilon < 1$ some fixed
constant.

Recalling the analysis done in the previous section, the evaluation of $P_n(\mathbf{A})$ was separated into
the evaluation of its eigenvalues and eigenvectors, $P_n(\lambda_i)\mathbf{v}_i = T_n(c\lambda_i - d)/T_n(c - d)\,\mathbf{v}_i$. In the
switching case we must take into account that both $\lambda_i$ and $\mathbf{v}_i$ change at each iteration. Moreover,
since the eigenvectors of different matrices are related we must also consider these relations.
For the moment, as a first simplification of the problem, let us forget about the changes in $\mathbf{v}_i$
and the parameters $c$ and $d$ and let us study the scalar evaluation of the Chebyshev recurrence
(2) with a different $\lambda_i$ at each iteration. That is,

$$T_0(\Lambda) = 1, \quad T_1(\Lambda) = \lambda(1), \quad T_n(\Lambda) = 2\lambda(n)T_{n-1}(\Lambda) - T_{n-2}(\Lambda), \tag{19}$$

where $\Lambda = \{\lambda(n)\}$, $n \in \mathbb{N}$, is a sequence of real numbers. Specifically, we are interested in
the behavior of $|T_n(\Lambda)|$.

Proposition 5.2: Suppose there exist values $\lambda_{\min}$ and $\lambda_{\max}$ such that $\lambda(n) \in [\lambda_{\min}, \lambda_{\max}]$, $\forall n \in \mathbb{N}$,
$\lambda_{\min} < 0 < \lambda_{\max}$ and $|\lambda_{\min}| \leq \lambda_{\max}$. Then

$$|T_n(\Lambda)| \leq |T_n(\Lambda^*)|, \tag{20}$$

where $\Lambda^* = \{\lambda^*(n)\}$ is a sequence defined by

$$\lambda^*(n) = \begin{cases} \lambda_{\max}, & \text{if } n \text{ odd}, \\ \lambda_{\min}, & \text{if } n \text{ even}. \end{cases} \tag{21}$$

Proof. For abbreviation, in the proof we will denote the sign of $T_n(\Lambda)$ by $s(T_n)$.
Let us note that, if $s(T_{n-1}) = s(T_{n-2})$, by choosing $\lambda(n) < 0$, then

$$|T_n(\Lambda)| = |2\lambda(n)T_{n-1}(\Lambda) - T_{n-2}(\Lambda)| = |2\lambda(n)T_{n-1}(\Lambda)| + |T_{n-2}(\Lambda)|, \tag{22}$$

independently of $n$. The choice of $\lambda(n) > 0$ when $s(T_{n-1}) = s(T_{n-2})$ implies that

$$|T_n(\Lambda)| = |2\lambda(n)T_{n-1}(\Lambda) - T_{n-2}(\Lambda)| < |2\lambda(n)T_{n-1}(\Lambda)| + |T_{n-2}(\Lambda)|. \tag{23}$$

Taking these two facts into account we can see that

$$s(T_{n-1}) = s(T_{n-2}) \;\Rightarrow\; \arg\max_{\lambda(n)} |T_n(\Lambda)| = \lambda_{\min}. \tag{24}$$

Besides, in this situation, choosing $\lambda(n) < 0$ yields $s(T_n) \neq s(T_{n-1})$.
Now, if $s(T_{n-1}) \neq s(T_{n-2})$ and $\lambda(n) > 0$, then eq. (22) is again true. On the other hand,
choosing $\lambda(n) < 0$ in this situation implies (23). Thus,

$$s(T_{n-1}) \neq s(T_{n-2}) \;\Rightarrow\; \arg\max_{\lambda(n)} |T_n(\Lambda)| = \lambda_{\max}. \tag{25}$$

Also, if $s(T_{n-1}) \neq s(T_{n-2})$ and $\lambda(n) > 0$, then $s(T_n) = s(T_{n-1})$.
Finally, noting that inequality (20) holds for $n = 0$ and $1$, and $s(T_0(\Lambda^*)) = s(T_1(\Lambda^*))$, then
using (24) and (25) the sequence (21) is obtained and the result is proved.



Corollary 5.3: If $|\lambda_{\min}| > \lambda_{\max}$ then the bound in eq. (20) is true taking $\Lambda^* = \{\lambda^*(n)\}$ with

$$\lambda^*(n) = \begin{cases} \lambda_{\max}, & \text{if } n \text{ even}, \\ \lambda_{\min}, & \text{if } n \text{ odd}. \end{cases} \tag{26}$$

The previous proposition reveals that the Chebyshev recurrence evaluated in a sequence of
different real numbers does not keep the behavior shown when it is evaluated with a constant
value. The next lemma provides a bound for the direct expression of this behavior.



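Lemma 5.4: Let us suppose that the conditions of Proposition 5.2 are true. Then

$$|T_n(\Lambda^*)| \leq \kappa_1(\lambda_{\max})^n, \quad \text{where } \kappa_1(\lambda_{\max}) = \lambda_{\max} + \sqrt{\lambda_{\max}^2 + 1}. \tag{27}$$

Proof. Let us define the recurrence

$$T_0^*(\lambda) = 1, \quad T_1^*(\lambda) = \lambda, \quad T_n^*(\lambda) = 2\lambda T_{n-1}^*(\lambda) + T_{n-2}^*(\lambda), \tag{28}$$

which satisfies that

$$|T_n(\Lambda^*)| \leq T_n^*(\lambda_{\max}). \tag{29}$$

According to recurrence (28), the sequence $\{T_n^*(\lambda_{\max}),\ n = 0, 1, \ldots\}$ satisfies the homogeneous
difference equation $T_n^*(\lambda_{\max}) - 2\lambda_{\max}T_{n-1}^*(\lambda_{\max}) - T_{n-2}^*(\lambda_{\max}) = 0$. By the theory of
difference equations [35], the solution to this equation is determined by the roots $\kappa_1$ and $\kappa_2$ of
the characteristic polynomial. In this case

$$\kappa_1(\lambda_{\max}) = \lambda_{\max} + \sqrt{\lambda_{\max}^2 + 1} > 1, \quad \text{and} \quad \kappa_2(\lambda_{\max}) = \lambda_{\max} - \sqrt{\lambda_{\max}^2 + 1} = -1/\kappa_1(\lambda_{\max}). \tag{30}$$

Since $\kappa_1(\lambda_{\max}) \neq \kappa_2(\lambda_{\max})$, the direct expression of $T_n^*(\lambda_{\max})$ is

$$T_n^*(\lambda_{\max}) = A\,\kappa_1(\lambda_{\max})^n + B\,\kappa_2(\lambda_{\max})^n, \tag{31}$$

where $A$ and $B$ depend on the initial conditions $T_0^*(\lambda_{\max})$ and $T_1^*(\lambda_{\max})$. In our case $A = B = 1/2$ and

$$|T_n(\Lambda^*)| \leq T_n^*(\lambda_{\max}) = \frac{1}{2}\left(\kappa_1(\lambda_{\max})^n + (-1/\kappa_1(\lambda_{\max}))^n\right) \leq \kappa_1(\lambda_{\max})^n. \tag{32}$$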
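
The chain of bounds in (29) and (32) can be checked numerically. The following Python sketch is only an illustration: it draws a random sequence $\lambda(n)$ in $[\lambda_{\min}, \lambda_{\max}]$ (the interval, the seed and the function names are our own choices) and compares $|T_n(\Lambda)|$ with the bounding recurrence (28) and with $\kappa_1(\lambda_{\max})^n$.

```python
import math
import random


def switched_chebyshev(lams):
    """Values T_0, ..., T_n of recurrence (19), where lams[k-1] plays the role of lambda(k)."""
    t = [1.0, lams[0]]
    for lam in lams[1:]:
        t.append(2.0 * lam * t[-1] - t[-2])
    return t


def bounding_recurrence(lam_max, n):
    """Values T*_0, ..., T*_n of recurrence (28) evaluated at lam_max (n >= 1)."""
    t = [1.0, lam_max]
    for _ in range(2, n + 1):
        t.append(2.0 * lam_max * t[-1] + t[-2])
    return t


def kappa1(lam):
    """Largest root of the characteristic polynomial of (28), see (30)."""
    return lam + math.sqrt(lam * lam + 1.0)


if __name__ == "__main__":
    random.seed(1)
    lam_min, lam_max, n = -0.4, 0.7, 20
    lams = [random.uniform(lam_min, lam_max) for _ in range(n)]
    t = switched_chebyshev(lams)
    t_star = bounding_recurrence(lam_max, n)
    for k in (5, 10, 20):
        print(k, abs(t[k]), t_star[k], kappa1(lam_max) ** k)
```

The bound is usually far from tight for random sequences, which is consistent with the worst-case nature of the analysis.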



The direct expression (27) will be helpful in the development of the convergence analysis
dealing with changing matrices and the parameters $c$ and $d$. We provide now the main result,
showing the convergence of the algorithm for the switching case.

Theorem 5.5: Allow the communication graph, $\mathcal{G}(n)$, to arbitrarily change in such a way that
it is connected for all $n$, with the weight matrices, $\mathbf{A}(n)$, designed according to Assumption 5.1.
Let us denote by $\lambda_i(n)$, $i = 1, \ldots, N$, the eigenvalues of $\mathbf{A}(n)$ and

$$\lambda_{\max} = \max_n \max_{i=2,\ldots,N} \lambda_i(n), \quad \text{and} \quad \lambda_{\min} = \min_n \min_{i=2,\ldots,N} \lambda_i(n). \tag{33}$$

Given fixed parameters $c$ and $d$, a sufficient condition to guarantee convergence to consensus of
iteration (18) is

$$\kappa_1(\max\{|c\lambda_{\max} - d|, |c\lambda_{\min} - d|\})\,\tau(c - d) < 1. \tag{34}$$

Proof. See the Appendix. ■

The next corollaries give more specific values of $\lambda_M$ and $\lambda_m$, and therefore of $c$ and $d$, that
satisfy the condition in the theorem to achieve convergence.

Corollary 5.6: Assume $|c\lambda_{\max} - d| \geq |c\lambda_{\min} - d|$ and a symmetric assignation, $-\lambda_m = \lambda_M = \lambda$,
of the parameters. Then if

$$\lambda^2 < (1 - \lambda_{\max}^2), \tag{35}$$

the algorithm converges.

Proof. Recall that with this assignation $c = 1/\lambda$ and $d = 0$. Substituting $\kappa_1$ and $\tau$ by their
values in eq. (34) and doing some simplifications, eq. (35) is obtained.



If we prefer to assign non-symmetric values to the parameters, the following corollary provides
a possible assignation that satisfies Theorem 5.5.

Corollary 5.7: Assume now that the values of $\lambda_{\max}$ and $\lambda_{\min}$, or some bounds, are known.
If $\lambda_M$ and $\lambda_m$ satisfy that

$$\lambda_M + \lambda_m = \lambda_{\max} + \lambda_{\min}, \tag{36}$$

and

$$\lambda_M - \lambda_m < \sqrt{4(1 - \lambda_{\max})(1 - \lambda_{\min})}, \tag{37}$$

then the algorithm achieves the consensus.

Proof. If we know the values of $\lambda_{\max}$ and $\lambda_{\min}$, the choice of $\lambda_M$ and $\lambda_m$ can be done in such
a way that

$$|c\lambda_{\min} - d| = |c\lambda_{\max} - d|. \tag{38}$$

With this assignation we are minimizing the value of $\max\{|c\lambda_{\max} - d|, |c\lambda_{\min} - d|\}$ and therefore,
the convergence condition is easier to fulfill. Solving (38) yields (36). With this first condition,
doing some rather tedious calculations in eq. (34), the second condition (37) is obtained. ■
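
As a hedged illustration of how these corollaries might be used in practice, the Python sketch below evaluates the left-hand side of (34) and builds parameters following (36) and (37). The function names and the `shrink` safety factor are our own assumptions and do not appear in the analysis; `shrink < 1` keeps the interval width strictly inside the bound (37).

```python
import math


def tau(x):
    """tau(x) from (4), for real x with |x| >= 1."""
    root = math.sqrt(x * x - 1.0)
    return x - root if x >= 0 else x + root


def kappa1(y):
    """kappa_1(y) = y + sqrt(y^2 + 1), as in (27)."""
    return y + math.sqrt(y * y + 1.0)


def switching_condition(lam_m, lam_M, lam_min, lam_max):
    """Left-hand side of (34); a value below 1 is sufficient for convergence."""
    c = 2.0 / (lam_M - lam_m)
    d = (lam_M + lam_m) / (lam_M - lam_m)
    y = max(abs(c * lam_max - d), abs(c * lam_min - d))
    return kappa1(y) * tau(c - d)


def corollary_5_7_parameters(lam_min, lam_max, shrink=0.9):
    """Parameters centred as in (36), with width equal to shrink times the bound (37)."""
    centre = 0.5 * (lam_max + lam_min)
    width = shrink * math.sqrt(4.0 * (1.0 - lam_max) * (1.0 - lam_min))
    return centre - 0.5 * width, centre + 0.5 * width


if __name__ == "__main__":
    lam_min, lam_max = -0.3, 0.8  # hypothetical eigenvalue bounds
    lam_m, lam_M = corollary_5_7_parameters(lam_min, lam_max)
    print(lam_m, lam_M, switching_condition(lam_m, lam_M, lam_min, lam_max))
    # Symmetric choice of Corollary 5.6 with lam^2 < 1 - lam_max^2.
    lam = 0.9 * math.sqrt(1.0 - lam_max ** 2)
    print(lam, switching_condition(-lam, lam, lam_min, lam_max))
```

Both choices give a value below 1 for these hypothetical bounds, whereas widening the interval beyond (37), for instance with `shrink=1.1`, pushes the left-hand side of (34) above 1.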

We discuss now in detail the meaning of the theorem and its implications.

Remark 5.8: Note that the theorem provides just a sufficient condition to ensure convergence.
This means that although the given bounds seem very restrictive, in practice, even if we choose
large values of $\lambda_M$ and $\lambda_m$, there will be convergence. Moreover, an important consequence of
corollaries 5.6 and 5.7 is that, independently of the changes of the network topology, there are
always parameters such that the method converges to the consensus.

Remark 5.9: It is also interesting to note the different behavior of the algorithm when the
topology changes with respect to the fixed case. In the latter case, in general it is better to
select the parameters $\lambda_M$ and $\lambda_m$ with large modulus to ensure that all the eigenvalues of
the weight matrix are included in the interval $[\lambda_m, \lambda_M]$. However, in the switching case, it
is necessary to choose them small so that $c - d$ is large enough to guarantee convergence.
This happens because the more variation in the eigenvalues of the weight matrices, the larger
$\kappa_1(\max\{|c\lambda_{\max} - d|, |c\lambda_{\min} - d|\})$ is. Therefore, the larger $\lambda_{\max}$, the smaller (in modulus) $\lambda_M$ and
$\lambda_m$ should be chosen.

Remark 5.10: The analysis followed to prove convergence of our algorithm is also interesting
because it can be applied to more general consensus algorithms based on recurrences of order
greater than one. Given a recurrence similar to (18), if a scalar difference equation is found
such that its solution bounds the original one in the worst case, a convergence result using
the behavior of this recurrence can be obtained. To the authors' knowledge, this is the first
theoretical result proving convergence of a distributed algorithm based on polynomials under
switching communication topologies.

Finally, we provide a discussion about the assumptions we have made to prove convergence.
• Symmetric weight matrices: If the weight matrices are not symmetric, then we cannot ensure
that the norm of the matrices used to change the base of eigenvectors is equal to 1. In such
a case the convergence condition in Theorem 5.5 would be $K\kappa_1(\max\{|c\lambda_{\max} - d|, |c\lambda_{\min} - d|\})\,\tau(c - d) < 1$,
with $K > 1$ some positive constant. It is also important to remark that, in
this situation, the left eigenvector associated to $\lambda_1(n)$ is not constant anymore for different
matrices. This makes the theoretical analysis of the behavior more tedious because at each
iteration it is affected by these eigenvectors, which do not tend to zero with $n$. However,
convergence can still be achieved.
• Connectivity of the graphs: The assumption about the connectivity of each graph is more
restrictive than in other approaches, e.g., [9], where only joint connectivity is imposed. In
our analysis, if one graph is disconnected, then $\lambda_{\max} = 1$ and the sufficient condition (34)
is never satisfied. This, of course, happens because we are considering the worst case
scenario, so that we can model the behavior of the Chebyshev recurrence as the $n$th power
of some quantity. However, in practice, even if some graphs are disconnected, the errors
associated to the eigenvectors of the eigenvalue 1 are also canceled. We show
this in simulations in section VI.

VI. Simulations

In this section we analyze our algorithm in a simulated environment. Monte Carlo experiments have been designed to study the convergence of the method and the influence of the parameters $\lambda_M$ and $\lambda_m$ on the algorithm.

A. Evaluation with a fixed communication topology

First we study the algorithm when the topology of the network is fixed. We analyze the convergence speed for different weight matrices, comparing it with other approaches, and the influence of the parameters $\lambda_M$ and $\lambda_m$ on the performance of the algorithm.

In the experiments we have considered 100 random networks of 100 nodes. For each network the nodes have been randomly positioned in a square of 200 x 200 meters. Two nodes communicate if they are at a distance lower than 20 meters. The networks are also forced to be connected so that the algorithms converge. After that, 100 different random initial values have been generated in the interval $(0, 1)^N$, giving a total of 10000 trials to test the algorithm.
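The setup above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and since the text does not say how connectivity is forced, the sketch only provides a connectivity test.

```python
import math
import random

def random_geometric_network(n_nodes, side, radius, rng):
    """Place nodes uniformly in a side x side square; link nodes closer than radius."""
    pos = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n_nodes)]
    neighbors = [set() for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if math.dist(pos[i], pos[j]) < radius:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return pos, neighbors

def is_connected(neighbors):
    """Depth-first search from node 0; connected iff every node is reached."""
    seen, stack = {0}, [0]
    while stack:
        for j in neighbors[stack.pop()]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(neighbors)

# Paper-sized instance: 100 nodes, 200 m square, 20 m communication radius.
pos, neighbors = random_geometric_network(100, 200.0, 20.0, random.Random(0))
# Random initial values in (0, 1), one per node.
x0 = [random.Random(1).random() for _ in range(100)]
```

A sampled network that fails `is_connected` would have to be repaired or discarded before running the consensus algorithms.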



1) Convergence speed of the algorithm: We evaluate how our algorithm behaves compared to other methods using different weighted adjacency matrices. For each communication network we have computed 4 different weighted adjacency matrices. The first one, $\mathbf{A}_{ld}$, uses the "local degree weights", the second one, $\mathbf{A}_{bc}$, uses the "best constant factor" and the third one, $\mathbf{A}_{os}$, computes an approximation of the "optimal symmetric weights". For more information about these matrices we refer the reader to [24]. These three matrices are symmetric. For that reason we have included in the experiment a fourth, non-symmetric matrix, $\mathbf{A}_{ns}$, computed by $a_{ij} = 1/(N_i + 1)$ if $j \in \mathcal{N}_i \cup \{i\}$ and $a_{ij} = 0$ otherwise.
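The non-symmetric weights can be built from purely local information, as the sketch below illustrates. It assumes that $N_i$ denotes the number of neighbors of node $i$. The function name and the star-graph example are hypothetical.

```python
def nonsymmetric_weights(neighbors):
    """Row i puts weight 1/(N_i + 1) on node i and on each of its N_i neighbors."""
    n = len(neighbors)
    A = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbors):
        w = 1.0 / (len(nbrs) + 1)
        A[i][i] = w
        for j in nbrs:
            A[i][j] = w
    return A

# Star graph: node 0 is linked to nodes 1, 2 and 3.
star = [{1, 2, 3}, {0}, {0}, {0}]
A_ns = nonsymmetric_weights(star)
```

Every row sums to one, but the matrix is not symmetric: here the weight from node 1 to node 0 is 1/4 while the weight from node 0 to node 1 is 1/2.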

We have compared our method with the powers of the matrices using (8), the Newton interpolation polynomial of degree 2 proposed in [29], $N_2(x) = (x - \alpha)^2/(1 - \alpha)^2$, and the second order recurrence with fixed weights proposed in [32], $F_n(x) = \beta x F_{n-1}(x) + (1 - \beta)F_{n-2}(x)$. We have used the values $\alpha = (\lambda_2 + \lambda_N)/2$ and $\beta = 2/(1 + \sqrt{1 - \lambda_2^2})$, which give the best convergence rate for the two algorithms. For the Chebyshev polynomials we have also assigned the optimal parameters $\lambda_M = \lambda_2$ and $\lambda_m = \lambda_N$. We have measured the average number of iterations required to obtain an error, $e = \|\mathbf{x}(n) - (\mathbf{w}_1^T \mathbf{x}(0)/\mathbf{w}_1^T \mathbf{1})\mathbf{1}\|_\infty$, smaller than a given tolerance.
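The comparison between the powers of the matrix and the Chebyshev recurrence can be reproduced on a toy example. The sketch below assumes that recurrence (18) has the form $\mathbf{x}(n) = 2\frac{T_{n-1}(c-d)}{T_n(c-d)}(c\mathbf{A} - d\mathbf{I})\mathbf{x}(n-1) - \frac{T_{n-2}(c-d)}{T_n(c-d)}\mathbf{x}(n-2)$, consistent with (45) in the Appendix, and it propagates the ratio $T_{n-1}/T_n$ instead of the polynomials themselves to avoid overflow. The 6-node ring and all function names are hypothetical.

```python
def mat_vec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def error_inf(x, target):
    return max(abs(v - target) for v in x)

def power_iterations(A, x0, tol, max_iter=10000):
    """Iterations of x(n) = A x(n-1) needed to bring the error below tol."""
    avg = sum(x0) / len(x0)              # consensus value for symmetric weights
    x = list(x0)
    for n in range(1, max_iter + 1):
        x = mat_vec(A, x)
        if error_inf(x, avg) < tol:
            return n
    return None

def chebyshev_iterations(A, x0, lam_M, lam_m, tol, max_iter=10000):
    """Iterations of the Chebyshev recurrence needed to bring the error below tol."""
    c = 2.0 / (lam_M - lam_m)
    d = (lam_M + lam_m) / (lam_M - lam_m)
    avg = sum(x0) / len(x0)

    def shifted(x):                      # computes (c A - d I) x
        return [c * ax - d * v for ax, v in zip(mat_vec(A, x), x)]

    rho = 1.0 / (c - d)                  # rho_1 = T_0(c-d) / T_1(c-d)
    x_prev = list(x0)
    x_cur = [rho * s for s in shifted(x_prev)]
    for n in range(1, max_iter + 1):
        if error_inf(x_cur, avg) < tol:
            return n
        rho_next = 1.0 / (2.0 * (c - d) - rho)   # T_n / T_{n+1}
        x_next = [2.0 * rho_next * s - rho_next * rho * p
                  for s, p in zip(shifted(x_cur), x_prev)]
        x_prev, x_cur, rho = x_cur, x_next, rho_next
    return None

# Ring of 6 nodes, weight 1/3 on each node and on each of its two neighbors.
# Its eigenvalues are 1, 2/3, 0 and -1/3, so lambda_2 = 2/3 and lambda_N = -1/3.
N = 6
A = [[1.0 / 3.0 if (i - j) % N in (0, 1, N - 1) else 0.0 for j in range(N)]
     for i in range(N)]
x0 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
n_power = power_iterations(A, x0, 1e-5)
n_cheb = chebyshev_iterations(A, x0, 2.0 / 3.0, -1.0 / 3.0, 1e-5)
```

With the optimal parameters the Chebyshev error contracts roughly by $\tau(c-d) = 1/3$ per iteration in this example, against $\lambda_2 = 2/3$ for the powers of the matrix, so `n_cheb` is smaller than `n_power`.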

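TABLE I
Number of iterations for different algorithms and tolerances

Method \ Tolerance      $10^{-2}$   $10^{-3}$   $10^{-4}$   $10^{-5}$
$\mathbf{A}_{ld}^n$         396.1       899.0      1422.9      1902.9
$N_2(\mathbf{A}_{ld})$      381.4       748.9      1120.8      1474.5
$\mathbf{A}_{bc}^n$         470.5       892.4      1307.4      1691.5
$N_2(\mathbf{A}_{bc})$      475.7       897.0      1109.9      1493.7
$\mathbf{A}_{os}^n$         390.8       735.1      1092.0      1446.0
$N_2(\mathbf{A}_{os})$      426.8       792.4       964.3      1225.2
$\mathbf{A}_{ns}^n$         308.9       698.4      1116.7      1521.2
$N_2(\mathbf{A}_{ns})$      302.6       604.1       911.5      1216.4
$F_n(\mathbf{A}_{ld})$       45.7        71.9        98.0       124.2
$P_n(\mathbf{A}_{ld})$       41.8        62.2        82.6       103.0
$F_n(\mathbf{A}_{bc})$       45.2        67.4        91.2       114.6
$P_n(\mathbf{A}_{bc})$       44.6        66.4        88.1       109.9
$F_n(\mathbf{A}_{os})$       42.2        62.9        83.3       103.6
$P_n(\mathbf{A}_{os})$       42.1        62.6        83.0       103.4
$F_n(\mathbf{A}_{ns})$       40.8        63.9        86.8       109.8
$P_n(\mathbf{A}_{ns})$       38.6        57.1        75.6        94.1
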

Table I shows the results of the experiment. For every matrix our algorithm is the one that reaches the consensus first. The speed-up compared to the powers of the matrix and the Newton method is remarkable. Moreover, considering that the initial error is upper bounded by 1, note that our algorithm is able to reduce the error by five orders of magnitude ($10^{-5}$) in around $N = 100$ iterations (103.0, 109.9, 103.4 and 94.1 iterations in the table), which is the size of the network.

An interesting detail is that our algorithm converges faster using the "local degree weights", $\mathbf{A}_{ld}$ (103.0), and the "non-symmetric weights", $\mathbf{A}_{ns}$ (94.1), than using the other two matrices (109.9 and 103.4), even though the second largest eigenvalue of the other two matrices is smaller. This happens because the eigenvalues of $\mathbf{A}_{bc}$ and $\mathbf{A}_{os}$ are symmetrically placed with respect to zero, whereas for $\mathbf{A}_{ld}$ and $\mathbf{A}_{ns}$ we have $|\lambda_N| < \lambda_2$ (an example can be found in [24]). As a consequence, $c - d$ is larger and the algorithm converges faster. This is very convenient because the "local degree weights" and the "non-symmetric weights" can be easily computed in a distributed way without global information, whereas the other two require knowledge of the whole topology.
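The effect of the eigenvalue placement on $c - d$ can be checked numerically. The sketch below assumes $\tau(x) = x - \sqrt{x^2 - 1}$ for $x \geq 1$ and the optimal choice $\lambda_M = \lambda_2$, $\lambda_m = \lambda_N$. The two spectra are hypothetical values chosen only to illustrate the argument.

```python
import math

def tau(x):
    """tau(x) = x - sqrt(x^2 - 1) for x >= 1."""
    return x - math.sqrt(x * x - 1.0)

def chebyshev_rate(lam_2, lam_N):
    """Return (c - d, tau(c - d)) for lam_M = lam_2 and lam_m = lam_N."""
    c = 2.0 / (lam_2 - lam_N)
    d = (lam_2 + lam_N) / (lam_2 - lam_N)
    return c - d, tau(c - d)

# Symmetric spectrum with a smaller second eigenvalue ...
gap_sym, rate_sym = chebyshev_rate(0.93, -0.93)
# ... versus a larger second eigenvalue but |lam_N| < lam_2.
gap_asym, rate_asym = chebyshev_rate(0.95, -0.30)
```

In this example the second spectrum has the larger $\lambda_2$, yet it yields the larger $c - d$ and therefore the smaller factor $\tau(c - d)$.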

Regarding the non-symmetric weights, we have observed that $\lambda_2$ is, in general, small compared to the second eigenvalue of the symmetric matrices. Since the eigenvalues of $\mathbf{A}_{ns}$ also satisfy $|\lambda_N| < \lambda_2$, the convergence for this matrix is the fastest. Also note that these matrices are the easiest to compute. On the other hand, when using symmetric weight matrices the convergence value is known to be the average of the initial conditions, whereas with non-symmetric weights the convergence value depends on the matrix.

2) Dependence on the parameters $\lambda_M$ and $\lambda_m$: So far we have evaluated the convergence speed of our algorithm considering only the optimal parameters, which implies knowledge of the eigenvalues of the weight matrix. However, in most situations the nodes will have no knowledge of these eigenvalues. We now analyze the convergence rate of our algorithm when it is run using sub-optimal parameters. For simplicity we have only considered $\mathbf{A}_{ld}$ in this experiment.

The results are in Table II. The table shows the average number of iterations required to have an error lower than $10^{-3}$. The number of iterations is in all cases larger than in Table I (62.2 iterations), but the results are still good in most cases. The only problem appears when $\lambda_M + \lambda_m - 1 > \lambda_N$, because the algorithm diverges (cells with $\infty$ in the table). Nevertheless, the number of iterations is almost always smaller than using the powers of $\mathbf{A}_{ld}$ and the Newton polynomial (899.0 and 748.9 iterations in Table I, respectively). The results compared to $F_n$ evaluated with the optimal parameter (71.9 iterations in Table I) seem to be poor. However, the optimal $\beta$ requires the knowledge of $\lambda_2$ which, for now, we are assuming to be unknown. For that reason, in the last row of Table II we have included the results using $F_n$ evaluated with $\beta = 2/(1 + \sqrt{1 - \lambda_M^2})$, i.e., with the same estimation of $\lambda_2$ used for the Chebyshev polynomials. In this case we observe again that both methods present a similar performance when using the same parameters. The degree of freedom given by $\lambda_m$ is what distinguishes the two algorithms. By adjusting this parameter we can reduce the number of iterations in our algorithm.

TABLE II
Number of iterations using sub-optimal parameters and tolerance $10^{-3}$

$\lambda_m \backslash \lambda_M$     0.2      0.5      0.8      0.9      0.95     0.999
-0.2                                713.8    563.7    355.2   $\infty$ $\infty$ $\infty$
-0.5                                798.1    630.4    397.2    279.0    194.5     75.9
-0.8                                874.3    690.6    435.2    305.6    213.1     83.1
-0.9                                898.3    709.5    447.0    314.0    219.0     85.4
-0.95                               910.0    718.8    453.0    318.1    221.8     86.5
-0.999                              919.4    726.0    457.6    321.3    224.0     87.4
$F_n$                               757.5    672.4    463.9    320.9    227.1     93.0

Another advantage of using our algorithm with the weight matrix $\mathbf{A}_{ld}$, besides the computation using local information, is that its smallest eigenvalue, $\lambda_N$, is usually a negative value close to zero (in our simulations it has never been less than -0.5). The second largest eigenvalue depends on the number of nodes and links in the network, but in general it is close to one. Therefore, by choosing $\lambda_m = -0.5$ and $\lambda_M \simeq 1$ there is a great chance of obtaining a good convergence rate with almost no risk of divergence; see for example the cell in the second row and sixth column of Table II (153.7). A safer choice of parameters is $\lambda_m = -\lambda_M$, which we know has good convergence rates. In this case it is also convenient to choose $\lambda_M \simeq 1$ to ensure that all the eigenvalues are contained in $[\lambda_m, \lambda_M]$.
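The divergence condition $\lambda_M + \lambda_m - 1 > \lambda_N$ suggests a simple check for a candidate pair of parameters. The sketch below assumes real eigenvalues and uses the interval $(\lambda_M + \lambda_m - 1, 1)$ that appears in the proof of Theorem 4.1. The function name and the spectrum are hypothetical.

```python
def chebyshev_converges(lam_M, lam_m, eigenvalues):
    """True if every eigenvalue other than 1 lies in (lam_M + lam_m - 1, 1)."""
    lower = lam_M + lam_m - 1.0
    return all(lower < lam < 1.0 for lam in eigenvalues)

# Hypothetical spectrum with the eigenvalue 1 excluded; here lam_N = -0.35.
spectrum = [0.97, 0.4, -0.1, -0.35]

safe_choice = chebyshev_converges(0.999, -0.5, spectrum)        # lower bound -0.501
symmetric_choice = chebyshev_converges(0.999, -0.999, spectrum) # lower bound -1
risky_choice = chebyshev_converges(0.9, -0.2, spectrum)         # lower bound -0.3
```

The symmetric choice $\lambda_m = -\lambda_M$ always places the lower bound at $-1$, which is why it is the safer option.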



B. Evaluation with a switching communication topology

Let us see how the algorithm behaves when the topology of the network changes at different iterations. We start by showing the convergence in an illustrative example where the conditions of Theorem 5.5 are satisfied. After that we again run Monte Carlo experiments to analyze the algorithm in more realistic situations, where the conditions of Theorem 5.5 do not always hold.



1) Illustrative Example: The communication network considered, composed of 20 nodes, is depicted in Fig. 3 (top left) and is connected. In order to satisfy the conditions of Theorem 5.5, at each iteration we have randomly added some links to the network. In this way all the topologies remain connected, and the parameters $\lambda_{\max}$ and $\lambda_{\min}$ correspond to the second largest and the smallest eigenvalues of the initial weight matrix. Using the local degree weights, which return a symmetric matrix, these parameters are $\lambda_{\max} = 0.9477$ and $\lambda_{\min} = -0.1922$. Figure 3 (top middle and top right) depicts the evolution of $\mathbf{x}(n)$ using (18) with the parameters of Corollary 5.6, $\lambda_M = -\lambda_m = 0.3190$, and Corollary 5.7, $\lambda_M = 0.6274$, $\lambda_m = 0.1282$, respectively.

The evolution of $\mathbf{x}(n)$ using (8) is shown in Fig. 3 (bottom left). It is interesting to note the similarity of this graphic with the Chebyshev recurrence using the symmetric parameters given by Corollary 5.6 (top middle). Finally, to stress that the condition of Theorem 5.5 is only sufficient, Fig. 3 (bottom middle and bottom right) shows that the algorithm also converges to the consensus when parameters with larger modulus are chosen. In the example we have chosen these parameters using the criteria analyzed for the fixed topology situation. Moreover, the graphics show that the consensus is achieved in both cases in fewer iterations (the lines overlap earlier). Finally, note that the symmetry of all the weight matrices implies that, in all the cases, the value of the consensus is the average of the initial conditions.

2) Analysis of convergence depending on the evolution of the network and the parameters of the algorithm: We have again generated 100 random networks of 100 nodes, as in the fixed topology case. To model the changes in the communication topology we have considered three different scenarios. The first one assumes a fixed initial communication topology in which, at each iteration, the links can fail with constant probability equal to 0.05 (Link Failures). This is a usual way to model networks with unreliable or noisy communications. In the second scenario we consider a set of mobile agents that move randomly in the environment, so that at each iteration the communication topology evolves with the proximity graph defined by the new positions of the agents (Evolution with Motion). The last scenario assumes a new random network at each iteration (Random Network). Although this situation will be uncommon in reality, it is interesting to analyze it in order to study the properties of our algorithm. In the three scenarios we have used the local degree weights to define the weight matrix at each iteration. We have not enforced network connectivity, so the experiment may include several iterations with disconnected networks. We have set a maximum of 3000 iterations per trial.

[Figure 3: six panels. Top row: Communication Network; $\lambda_M = -\lambda_m = 0.3190$; $\lambda_M = 0.6274$, $\lambda_m = 0.1282$. Bottom row: $\prod_n \mathbf{A}(n)\mathbf{x}(0)$; $\lambda_M = 0.8$, $\lambda_m = -0.8$; $\lambda_M = 0.9$, $\lambda_m = -0.5$.]

Fig. 3. Illustrative example of the convergence speed of the algorithm with a switching communication topology. The initial network is shown in the top left graphic. The evolution using (8) is shown at the bottom left, and four different executions of (18) with the same changes in the topology and different parameters are depicted in the rest of the graphics. Notice that even when the conditions of Theorem 5.5 are not satisfied (bottom middle and bottom right graphics), the algorithm still achieves the consensus.
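The Link Failures scenario can be sketched as an independent coin flip per link at each iteration. This is a minimal illustration with hypothetical names, assuming failures are independent across links and iterations. The weight matrix would then be recomputed from the surviving links.

```python
import random

def apply_link_failures(edges, p_fail, rng):
    """Keep each undirected edge independently with probability 1 - p_fail."""
    return [e for e in edges if rng.random() >= p_fail]

rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
# One topology per iteration, all derived from the same initial network.
topologies = [apply_link_failures(edges, 0.05, rng) for _ in range(3)]
```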

Table III shows the number of iterations required by iteration (8) to achieve a precision of $10^{-3}$. When the network has link failures or evolves with the motion of the nodes, the number of iterations required by the algorithm is slightly greater than when the topology of the network remains fixed (1087.2 and 1032.4 compared to 899.0 in Table I). On the other hand, when the network changes randomly at each step, the consensus is achieved in a few iterations (9.4), which makes sense because in this situation the information spreads quickly.

TABLE III
Number of iterations with tolerance $10^{-3}$

Link Failures    Evolution with Motion    Random Networks
1087.2           1032.4                   9.4

The number of iterations required to achieve the same accuracy (tolerance of $10^{-3}$) using (18) with different parameters is shown in Tables IV, V and VI for the Link Failures, Evolution with Motion and Random Networks scenarios, respectively.



TABLE IV
Number of iterations for Link Failures

$\lambda_m \backslash \lambda_M$     0.25      0.5       0.75      0.9      0.95
-0.25                               > 3000    > 3000    1298.1    383.3    267.9
-0.5                                > 3000    > 3000    1328.6    418.9    293.5
-0.75                               > 3000    > 3000    1356.6    452.3    316.8
-0.9                                > 3000    > 3000    1321.0    470.9    330.0
-0.95                               > 3000    > 3000    1326.4    476.9    334.5

TABLE V
Number of iterations for Evolution with Motion

$\lambda_m \backslash \lambda_M$     0.25      0.5       0.75     0.9      0.95
-0.25                               > 3000    1738.0    600.1    457.2    260.9
-0.5                                > 3000    1765.2    665.6    461.9    306.5
-0.75                               1726.5    1793.5    703.6    506.3    309.8
-0.9                                1740.0    1813.0    708.5    564.9    311.0
-0.95                               1744.5    1818.0    710.4    564.9    311.5

TABLE VI
Number of iterations for Random Networks

$\lambda_m \backslash \lambda_M$     0.25    0.5     0.75    0.9     0.95
-0.25                               8.1     8.3     11.8    25.4    $\infty$
-0.5                                8.3     8.9     11.6    22.3    42.1
-0.75                               8.7     9.6     11.8    21.7    37.8
-0.9                                8.9     10.0    12.0    21.7    36.8
-0.95                               9.0     10.1    12.0    21.7    36.5



From these results we can extract some interesting remarks. First of all, for the parameters tested in the experiment, the algorithm is convergent in almost all the cases. Only in the Random Networks scenario does the algorithm diverge, when $\lambda_M = 0.95$ and $\lambda_m = -0.25$ (Table VI, first row and sixth column). The cells with "> 3000" iterations indicate that for these parameters the algorithm converges, but slowly. A second interesting detail is that, as in the fixed topology case, we can always find parameters that make our algorithm achieve the consensus faster than using (8) (results of Table III). However, it is surprising which parameters achieve this goal in the different scenarios. For the Link Failures and the Evolution with Motion scenarios, the best parameters are exactly those that make the algorithm diverge in the Random Networks scenario, i.e., $\lambda_M = 0.95$ and $\lambda_m = -0.25$, with 267.9 and 260.9 iterations respectively. On the other hand, the best parameters for the Random Networks are those that give the slowest convergence rate in the other two scenarios, i.e., $\lambda_M = 0.25$ and $\lambda_m = -0.25$, with 8.1 iterations in Table VI versus more than 3000 in Tables IV and V. The explanation for this phenomenon lies in the variability of the eigenvectors of the weight matrices. When the topology changes arbitrarily at each iteration, there is a great variability in the eigenvectors of the weight matrices, which results in a great variability of $\mathbf{x}(n)$. This situation is closer to the worst case we have shown in section IV to prove the convergence of the algorithm. Therefore, a good convergence rate requires a large value of $c - d$, achieved when $\lambda_M$ and $\lambda_m$ have small modulus. When the topology changes smoothly, as in the Link Failures and the Evolution with Motion scenarios, the eigenvectors barely change and the algorithm behaves similarly to the fixed case. For that reason, the parameters that achieve the best convergence rate are the same as in the fixed case. However, we must be careful because for larger values of $\lambda_M$ the algorithm may diverge.

A final detail is that, in all the cases, the convergence seems to be more affected by $\lambda_M$ than by $\lambda_m$. This is explained by the use of the local degree weights. As mentioned earlier, the eigenvalues of these matrices are not symmetric with respect to zero. In these matrices $\lambda_{\max}$ dominates the convergence rate, so the convergence is more sensitive to the parameter $\lambda_M$.

In conclusion, when the topology of the network changes, the parameters should be chosen taking into account the nature of these changes. For small changes, parameters similar to those of the fixed case should be assigned, whereas if the network is expected to change a lot we should pick small parameters to guarantee convergence.



VII. Conclusions

In this paper we have analyzed the properties of Chebyshev polynomials to design a fast distributed consensus algorithm. We have shown that the proposed algorithm significantly reduces the number of communication rounds required by the network to achieve the consensus. We have provided a theoretical analysis of the properties of the algorithm in both fixed and switching communication topologies. We have also evaluated our method with an extensive set of simulations. Both the theoretical and the empirical analysis support the benefits of our proposal.

Appendix



A. Proof of Theorem 4.1

We introduce two auxiliary results to prove the convergence.

Lemma 1.1: Given $x_1 > 1$, for any $x_2$ such that $|x_2| < x_1$ it holds that

$$\lim_{n \to \infty} \frac{T_n(x_2)}{T_n(x_1)} = 0. \tag{39}$$

Proof. For $|x_2| \leq 1$, $|T_n(x_2)| \leq 1$, $\forall n$, and since $T_n(x_1) \to \infty$ with $n$, eq. (39) is true. Now, if $1 < |x_2| < x_1$, then using (5) we have

$$\frac{T_n(x_2)}{T_n(x_1)} = \frac{\tau(x_1)^n}{\tau(x_2)^n}\,\frac{1 + \tau(x_2)^{2n}}{1 + \tau(x_1)^{2n}}. \tag{40}$$

But in this case $1 > |\tau(x_2)| > \tau(x_1) > 0$ and the result holds immediately. ■

Lemma 1.2: Given $x > 1$, for any complex number $z$ such that $|\tau(z)| = \min\{|z + \sqrt{z^2 - 1}|, |z - \sqrt{z^2 - 1}|\} > \tau(x)$, then $\lim_{n \to \infty} T_n(z)/T_n(x) = 0$.

Proof. It is a straightforward consequence of (40). ■
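Lemma 1.1 can be checked numerically for real arguments. The sketch below assumes that (5) is the closed form $T_n(x) = (1 + \tau(x)^{2n})/(2\tau(x)^n)$ with $\tau(x) = x - \sqrt{x^2 - 1}$, which is the form implied by (40). The function names and the sample points are hypothetical.

```python
import math

def cheb_T(n, x):
    """T_n(x) from the three-term recurrence T_n = 2 x T_{n-1} - T_{n-2}."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def tau(x):
    return x - math.sqrt(x * x - 1.0)

def cheb_T_closed(n, x):
    """Closed form (1 + tau^(2n)) / (2 tau^n), used here for x > 1."""
    t = tau(x)
    return (1.0 + t ** (2 * n)) / (2.0 * t ** n)

x1, x2 = 1.5, 1.2
# The ratio T_n(x2) / T_n(x1) shrinks roughly like (tau(x1) / tau(x2))^n.
ratios = [cheb_T(n, x2) / cheb_T(n, x1) for n in (1, 5, 10, 20, 40)]
```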

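Proof of Theorem 4.1. Let $\mathbf{Q} = \mathbf{A} - \mathbf{1}\mathbf{w}_1^T/\mathbf{w}_1^T\mathbf{1}$, whose eigenvalues are 0, with $\mathbf{v}_1$ its corresponding right eigenvector, and $\lambda_2, \ldots, \lambda_N$ with the same eigenvectors as $\mathbf{A}$. Since $\mathbf{v}_1 = \mathbf{w}_1^T\mathbf{x}(0)\mathbf{1}/\mathbf{w}_1^T\mathbf{1}$, then $\mathbf{1}\mathbf{w}_1^T(\mathbf{x}(0) - \mathbf{v}_1) = 0$. Taking this into account it is easy to see that

$$\mathbf{A}^n(\mathbf{x}(0) - \mathbf{v}_1) = \mathbf{Q}^n(\mathbf{x}(0) - \mathbf{v}_1), \quad \forall n \in \mathbb{N}, \tag{41}$$

and therefore $P_n(\mathbf{A})(\mathbf{x}(0) - \mathbf{v}_1) = P_n(\mathbf{Q})(\mathbf{x}(0) - \mathbf{v}_1)$.

Also $\mathbf{A}\mathbf{v}_1 = \mathbf{v}_1$ and $P_n(1) = 1$, so $P_n(\mathbf{A})\mathbf{v}_1 = \mathbf{v}_1$ and

$$\|\mathbf{x}(n) - \mathbf{v}_1\|_2 = \|P_n(\mathbf{A})(\mathbf{x}(0) - \mathbf{v}_1)\|_2 = \|P_n(\mathbf{Q})(\mathbf{x}(0) - \mathbf{v}_1)\|_2 \leq \|P_n(\mathbf{Q})\|_2\,\|\mathbf{x}(0) - \mathbf{v}_1\|_2. \tag{42}$$

In addition, since $\mathbf{A}$ is diagonalizable, so is $\mathbf{Q}$, which implies that $\mathbf{Q}$ can be decomposed as $\mathbf{Q} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}$, with $\mathbf{D} = \mathrm{diag}(0, \lambda_2, \ldots, \lambda_N)$. Using algebra rules we get that $P_n(\mathbf{Q}) = \mathbf{P}P_n(\mathbf{D})\mathbf{P}^{-1}$ and then

$$\|P_n(\mathbf{Q})\|_2 \leq \|\mathbf{P}\|_2\,\rho(P_n(\mathbf{D}))\,\|\mathbf{P}^{-1}\|_2 = K \max_{i \neq 1} |P_n(\lambda_i)| = K \max_{i \neq 1} \frac{|T_n(c\lambda_i - d)|}{T_n(c - d)}, \tag{43}$$

with $K$ the condition number of $\mathbf{P}$.

For any $x \in (\lambda_M + \lambda_m - 1, 1)$ we have that $|cx - d| < c - d$, so for all the real eigenvalues of $\mathbf{A}$ except $\lambda_1$, $|c\lambda_i - d| < c - d$. Noting that $c - d$ is strictly larger than 1 and that $\tau(c - d) < |\tau(c\lambda_i - d)|$ for any complex eigenvalue $\lambda_i$, by Lemmas 1.1 and 1.2 $P_n(\lambda_i) \to 0$ for all $i \neq 1$, which proves the convergence of the algorithm. ■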



B. Proof of Theorem 4.2

In order to prove Theorem 4.2 we will use the following auxiliary results.

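Lemma 1.3: Let $\lambda_m$, $\lambda_M$ be such that $[\lambda_N, \lambda_2] \not\subset [\lambda_m, \lambda_M]$ and $|c\lambda_N - d| < c\lambda_2 - d$. Then, for fixed $c$, $\nu(c, d)$ is a decreasing function of $d$.

Proof. Let us see that $\partial\nu(c, d)/\partial d < 0$. We have

$$\nu(c, d) = \frac{\tau(c - d)}{|\tau(c\lambda_2 - d)|} = \frac{\tau(c - d)}{\tau(c\lambda_2 - d)}.$$

Then

$$\frac{\partial\nu}{\partial d} = \frac{-\tau'(c - d)\,\tau(c\lambda_2 - d) + \tau(c - d)\,\tau'(c\lambda_2 - d)}{\tau(c\lambda_2 - d)^2}.$$

But since for $x > 0$, $\tau'(x) = -\tau(x)/\sqrt{x^2 - 1}$, then

$$\frac{\partial\nu}{\partial d} = \frac{\tau(c - d)}{\tau(c\lambda_2 - d)}\left[\frac{1}{\sqrt{(c - d)^2 - 1}} - \frac{1}{\sqrt{(c\lambda_2 - d)^2 - 1}}\right],$$

which is negative because $1 < (c\lambda_2 - d)^2 < (c - d)^2$. ■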

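Lemma 1.4: Let $\lambda_m$, $\lambda_M$ be such that $[\lambda_N, \lambda_2] \not\subset [\lambda_m, \lambda_M]$ and $|c\lambda_N - d| > |c\lambda_2 - d|$ with $c\lambda_N - d < 0$. Then, for fixed $c$, $\nu(c, d)$ is an increasing function of $d$.

Proof. Let us see that $\partial\nu(c, d)/\partial d > 0$. We have

$$\nu(c, d) = \frac{\tau(c - d)}{|\tau(c\lambda_N - d)|} = \frac{\tau(c - d)}{-\tau(c\lambda_N - d)}.$$

Then

$$\frac{\partial\nu}{\partial d} = \frac{\tau'(c - d)\,\tau(c\lambda_N - d) - \tau(c - d)\,\tau'(c\lambda_N - d)}{\tau(c\lambda_N - d)^2}.$$

But since, for $x < 0$, $\tau'(x) = \tau(x)/\sqrt{x^2 - 1}$, then

$$\frac{\partial\nu}{\partial d} = \frac{\tau(c - d)}{-\tau(c\lambda_N - d)}\left[\frac{1}{\sqrt{(c - d)^2 - 1}} + \frac{1}{\sqrt{(c\lambda_N - d)^2 - 1}}\right],$$

which is positive. ■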

Proposition 1.5: Let $\lambda_m$, $\lambda_M$ be such that $\lambda_M - \lambda_m = 2/c$ is fixed and $[\lambda_N, \lambda_2] \not\subset [\lambda_m, \lambda_M]$. Then

i) If $\lambda_2 - \lambda_N > \lambda_M - \lambda_m$, $\nu(c, d) \geq \nu(c, d^*)$, $d^*$ being the value such that $\lambda_M + \lambda_m = \lambda_2 + \lambda_N$; that is, for a fixed $c$, $\nu(c, d)$ is minimum when $\lambda_m$, $\lambda_M$ are symmetrically placed with respect to $\lambda_N$, $\lambda_2$.

ii) If $\lambda_2 - \lambda_N < \lambda_M - \lambda_m$ and $\lambda_M < \lambda_2$, then $\nu(c, d) \geq \nu(c, d^*)$, $d^*$ being such that $\lambda_M = \lambda_2$, and in this case $[\lambda_N, \lambda_2] \subset [\lambda_m, \lambda_M]$.

iii) If $\lambda_2 - \lambda_N < \lambda_M - \lambda_m$ and $\lambda_m > \lambda_N$, then $\nu(c, d) \geq \nu(c, d^*)$, $d^*$ being such that $\lambda_m = \lambda_N$, and in this case $[\lambda_N, \lambda_2] \subset [\lambda_m, \lambda_M]$.

Proof.

i) The result follows from Lemmas 1.3 and 1.4. If $\lambda_2 > \lambda_M$, then $c\lambda_2 - d > |c\lambda_N - d|$ and $\nu(c, d)$ is a decreasing function of $d = (\lambda_M + \lambda_m)c/2$, which means that it decreases as $\lambda_M$ increases. The maximum value of $\lambda_M$ for which these conditions hold is $\lambda_M = 1/c + (\lambda_2 + \lambda_N)/2$, for which $c\lambda_2 - d = |c\lambda_N - d|$. If $\lambda_N < \lambda_m$, then $c\lambda_2 - d < |c\lambda_N - d|$ and $\nu(c, d)$ is an increasing function of $d = (\lambda_M + \lambda_m)c/2$, which means that it increases when $\lambda_M$ increases. The minimum value of $\lambda_M$ for which these conditions hold is $\lambda_M = 1/c + (\lambda_2 + \lambda_N)/2$, for which $c\lambda_2 - d = |c\lambda_N - d|$.

ii) In this case $c\lambda_2 - d > |c\lambda_N - d|$, and $\nu(c, d)$ is a decreasing function of $d = (\lambda_M + \lambda_m)c/2$, which means that it decreases when $\lambda_M$ increases. The maximum value of $\lambda_M$ for which these conditions hold is $\lambda_M = \lambda_2$.

iii) In this case $c\lambda_2 - d < |c\lambda_N - d|$, and $\nu(c, d)$ is an increasing function of $d = (\lambda_M + \lambda_m)c/2$, which means that it increases when $\lambda_m$ increases. The minimum value of $\lambda_m$ for which these conditions hold is $\lambda_m = \lambda_N$.

■

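Finally, we are able to prove the theorem.

Proof of Theorem 4.2. If $[\lambda_N, \lambda_2] \subseteq [\lambda_m, \lambda_M]$ the result was proved in [33]. Let us suppose then that $[\lambda_N, \lambda_2] \not\subset [\lambda_m, \lambda_M]$. If $\lambda_2 - \lambda_N < \lambda_M - \lambda_m$, it has been shown in Proposition 1.5 that $\nu(c, d)$ has smaller values for $c$, $d$ such that $[\lambda_N, \lambda_2] \subset [\lambda_m, \lambda_M]$, and in this case $\lambda_2 = \lambda_M$ and $\lambda_N = \lambda_m$ yield the minimum $\nu(c, d)$.

If $\lambda_2 - \lambda_N > \lambda_M - \lambda_m$, we have seen in Proposition 1.5 that $\nu(c, d)$ is smaller for $c$, $d$ such that $\lambda_m$, $\lambda_M$ are symmetrically placed with respect to $\lambda_N$, $\lambda_2$, that is, $\lambda_M = \lambda_2 - \alpha$ and $\lambda_m = \lambda_N + \alpha$, $\alpha > 0$. Let us see that $\nu(c, d)$ is minimum for $\alpha = 0$. First, note that

$$c = \frac{2}{\lambda_M - \lambda_m} = \frac{2}{\lambda_2 - \lambda_N - 2\alpha} \quad \text{and} \quad d = \frac{\lambda_M + \lambda_m}{\lambda_M - \lambda_m} = \frac{\lambda_2 + \lambda_N}{\lambda_2 - \lambda_N - 2\alpha}.$$

Thus

$$\nu(c, d) = \frac{\tau(c - d)}{\tau(c\lambda_2 - d)} = \frac{\tau(c - d)}{-\tau(c\lambda_N - d)},$$

and taking into account that

$$\frac{\partial}{\partial\alpha}(c\lambda - d) = \frac{2(2\lambda - \lambda_2 - \lambda_N)}{(\lambda_2 - \lambda_N - 2\alpha)^2} = \frac{2(c\lambda - d)}{\lambda_2 - \lambda_N - 2\alpha},$$

we obtain

$$\frac{\partial\nu(c, d)}{\partial\alpha} = \frac{-2\tau(c - d)}{\tau(c\lambda_2 - d)(\lambda_2 - \lambda_N - 2\alpha)}\left[\frac{c - d}{\sqrt{(c - d)^2 - 1}} - \frac{c\lambda_2 - d}{\sqrt{(c\lambda_2 - d)^2 - 1}}\right] > 0.$$

Then $\nu(c, d)$ is increasing with $\alpha$ and the minimum value is obtained for $\alpha = 0$. ■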



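C. Proof of Theorem 5.5

First of all, let us state the notation we will follow throughout the proof. For any weight matrix $\mathbf{A}(n)$ we denote its eigenvectors by $\mathbf{v}_i(n)$, $i = 1, \ldots, N$. Let $\mathbf{V}(n) = [\mathbf{v}_1(n), \ldots, \mathbf{v}_N(n)]$ be the matrix with all the eigenvectors of $\mathbf{A}(n)$. Thus, $\mathbf{A}(n)\mathbf{V}(n) = \mathbf{V}(n)\mathbf{D}(n)$, with $\mathbf{D}(n) = \mathrm{diag}(\lambda_1(n), \ldots, \lambda_N(n))$. Since $\mathbf{A}(n)$ is symmetric, it is diagonalizable and we can choose the base of eigenvectors in such a way that $\mathbf{V}(n)$ is orthogonal. Therefore, $\mathbf{v}_1(n)^T\mathbf{v}_i(n) = 0$, $\forall i = 2, \ldots, N$, and $\mathbf{v}_1(n) = \mathbf{1}/\sqrt{N} = \mathbf{v}_1$, for all $n$.

Let $\mathbf{Q}(n) = \mathbf{A}(n) - \mathbf{1}\mathbf{1}^T/N$, whose eigenvalues are 0, with $\mathbf{v}_1(n) = \mathbf{1}/\sqrt{N}$ its corresponding eigenvector, and $\lambda_2(n), \ldots, \lambda_N(n)$, with the same eigenvectors as $\mathbf{A}(n)$. Taking all of this into account it is easy to see that $\mathbf{1}\mathbf{1}^T(\mathbf{x}(0) - (\mathbf{v}_1^T\mathbf{x}(0))\mathbf{v}_1) = 0$, and

$$\mathbf{A}(n)(\mathbf{x}(n) - (\mathbf{v}_1^T\mathbf{x}(0))\mathbf{v}_1) = \mathbf{Q}(n)(\mathbf{x}(n) - (\mathbf{v}_1^T\mathbf{x}(0))\mathbf{v}_1). \tag{44}$$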

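Given two consecutive matrices, $\mathbf{Q}(n)$ and $\mathbf{Q}(n - 1)$, let $\mathbf{P}(n)$ be the matrix such that $\mathbf{V}(n - 1) = \mathbf{V}(n)\mathbf{P}(n)$, that is, the matrix that changes from the base of eigenvectors of $\mathbf{Q}(n - 1)$ to the base of eigenvectors of $\mathbf{Q}(n)$. In a similar way, $\mathbf{R}(n)$ will be such that $\mathbf{V}(n - 2) = \mathbf{V}(n)\mathbf{R}(n)$. The orthogonality of $\mathbf{V}(n)$ implies that the matrices $\mathbf{P}(n) = \mathbf{V}(n)^{-1}\mathbf{V}(n - 1)$ and $\mathbf{R}(n) = \mathbf{V}(n)^{-1}\mathbf{V}(n - 2)$ are also orthogonal, and $\|\mathbf{P}(n)\|_2 = \|\mathbf{R}(n)\|_2 = 1$.

Recalling the Chebyshev recurrence (18), we define the error at iteration $n$ by $\mathbf{x}(n) - (\mathbf{v}_1^T\mathbf{x}(0))\mathbf{v}_1$. The equivalence

$$\mathbf{v}_1 = 2\frac{T_{n-1}(c - d)}{T_n(c - d)}(c\mathbf{A}(n) - d\mathbf{I})\mathbf{v}_1 - \frac{T_{n-2}(c - d)}{T_n(c - d)}\mathbf{v}_1 \tag{45}$$

allows us to express the error by $\mathbf{e}(n)/T_n(c - d)$, with $\mathbf{e}(0) = \mathbf{x}(0) - (\mathbf{v}_1^T\mathbf{x}(0))\mathbf{v}_1$, $\mathbf{e}(1) = (c\mathbf{Q}(1) - d\mathbf{I})\mathbf{e}(0)$ and

$$\mathbf{e}(n) = 2(c\mathbf{Q}(n) - d\mathbf{I})\mathbf{e}(n - 1) - \mathbf{e}(n - 2). \tag{46}$$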

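Each vector $\mathbf{e}(n)$ can be expressed as a linear combination of the eigenvectors of $\mathbf{Q}(n)$,

$$\mathbf{e}(n) = \sum_{i=1}^{N} \alpha_i(n)\mathbf{v}_i(n) = \mathbf{V}(n)\boldsymbol{\alpha}(n). \tag{47}$$

Replacing $\mathbf{e}(n - 1)$ and $\mathbf{e}(n - 2)$ by (47) in (46),

$$\begin{aligned}
\mathbf{e}(n) &= 2(c\mathbf{Q}(n) - d\mathbf{I})\mathbf{V}(n - 1)\boldsymbol{\alpha}(n - 1) - \mathbf{V}(n - 2)\boldsymbol{\alpha}(n - 2) \\
&= 2(c\mathbf{Q}(n) - d\mathbf{I})\mathbf{V}(n)\mathbf{P}(n)\boldsymbol{\alpha}(n - 1) - \mathbf{V}(n)\mathbf{R}(n)\boldsymbol{\alpha}(n - 2) \\
&= 2\mathbf{V}(n)(c\mathbf{D}(n) - d\mathbf{I})\mathbf{P}(n)\boldsymbol{\alpha}(n - 1) - \mathbf{V}(n)\mathbf{R}(n)\boldsymbol{\alpha}(n - 2) \\
&= \mathbf{V}(n)\left[2(c\mathbf{D}(n) - d\mathbf{I})\mathbf{P}(n)\boldsymbol{\alpha}(n - 1) - \mathbf{R}(n)\boldsymbol{\alpha}(n - 2)\right] = \mathbf{V}(n)\boldsymbol{\alpha}(n).
\end{aligned} \tag{48}$$

Therefore, the vectors $\boldsymbol{\alpha}(n)$ satisfy the recurrence

$$\boldsymbol{\alpha}(n) = 2(c\mathbf{D}(n) - d\mathbf{I})\mathbf{P}(n)\boldsymbol{\alpha}(n - 1) - \mathbf{R}(n)\boldsymbol{\alpha}(n - 2), \tag{49}$$

with $\boldsymbol{\alpha}(0) = \boldsymbol{\alpha}(1)$.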

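Taking spectral norms,

$$\begin{aligned}
\|\boldsymbol{\alpha}(n)\|_2 &= \|2(c\mathbf{D}(n) - d\mathbf{I})\mathbf{P}(n)\boldsymbol{\alpha}(n - 1) - \mathbf{R}(n)\boldsymbol{\alpha}(n - 2)\|_2 \\
&\leq 2\|c\mathbf{D}(n) - d\mathbf{I}\|_2\,\|\mathbf{P}(n)\|_2\,\|\boldsymbol{\alpha}(n - 1)\|_2 + \|\mathbf{R}(n)\|_2\,\|\boldsymbol{\alpha}(n - 2)\|_2 \\
&\leq 2\max_{i=2,\ldots,N}|c\lambda_i(n) - d|\,\|\boldsymbol{\alpha}(n - 1)\|_2 + \|\boldsymbol{\alpha}(n - 2)\|_2.
\end{aligned} \tag{50}$$

By Lemma 5.4 we can bound the norm of $\boldsymbol{\alpha}(n)$ by

$$\|\boldsymbol{\alpha}(n)\| \leq \kappa_1(x_{\max})^n\,\|\boldsymbol{\alpha}(0)\|, \tag{51}$$

where the parameter $x_{\max}$ in this case is

$$x_{\max} = \max_n \max_{i=2,\ldots,N} |c\lambda_i(n) - d| = \max_n \max\{|c\lambda_2(n) - d|, |c\lambda_N(n) - d|\} = \max\{|c\lambda_{\max} - d|, |c\lambda_{\min} - d|\}. \tag{52}$$

Therefore, in order to make the error go to zero we require that

$$\lim_{n \to \infty} \frac{\kappa_1(x_{\max})^n}{T_n(c - d)} = 0. \tag{53}$$

Using (5),

$$\frac{\kappa_1(x_{\max})^n}{T_n(c - d)} = \frac{2\,\kappa_1(x_{\max})^n\,\tau(c - d)^n}{1 + \tau(c - d)^{2n}}, \tag{54}$$

which goes to zero if $\kappa_1(x_{\max})\tau(c - d) < 1$. When this happens $\lim_{n \to \infty} \mathbf{x}(n) = (\mathbf{1}^T\mathbf{x}(0)/\mathbf{1}^T\mathbf{1})\mathbf{1}$ and the consensus is achieved. ■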
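The sufficient condition $\kappa_1(x_{\max})\tau(c - d) < 1$ is easy to evaluate for a given pair of parameters. The sketch below assumes that $\kappa_1(x) = x + \sqrt{x^2 + 1}$, i.e., the dominant root of the scalar recurrence $y(n) = 2x\,y(n-1) + y(n-2)$ that bounds (50). The definition in Lemma 5.4 is not reproduced in this section, so this form is an assumption. The function names are hypothetical.

```python
import math

def tau(x):
    return x - math.sqrt(x * x - 1.0)

def kappa_1(x):
    """Assumed form: dominant root of k^2 - 2 x k - 1 = 0."""
    return x + math.sqrt(x * x + 1.0)

def sufficient_condition(lam_M, lam_m, lam_max, lam_min):
    """Return kappa_1(x_max) * tau(c - d); values below 1 satisfy the condition."""
    c = 2.0 / (lam_M - lam_m)
    d = (lam_M + lam_m) / (lam_M - lam_m)
    x_max = max(abs(c * lam_max - d), abs(c * lam_min - d))
    return kappa_1(x_max) * tau(c - d)

# Eigenvalue bounds of the illustrative example in Section VI-B.
lam_max, lam_min = 0.9477, -0.1922
small = sufficient_condition(0.3190, -0.3190, lam_max, lam_min)
large = sufficient_condition(0.8, -0.8, lam_max, lam_min)
```

With the symmetric parameters of Corollary 5.6 the product is just under 1, while the larger parameters $\lambda_M = -\lambda_m = 0.8$ violate the sufficient condition even though the algorithm still converges in the simulations.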



